I. Introduction

We  consider  here  the  following  problem:  a  student,  with  a  limited 
time  budget,  must  study  for  an  examination.  The  examination  will  consist 
of  several  questions,  one  from  each  of  several  fields.  The  student  will 
be  successful  (pass  the  exam)  if  he  answers  a  majority  of  the  questions 
correctly.  The  problem  is  to  decide  how  much  time  to  spend  on  each  of 
the  several  fields. 

Mathematically, we assume the subject is divided into n fields.
For i = 1, ..., n we assume a function

     p_i = f_i(x_i)

gives the probability that the question on the i-th field will be
correctly answered if the student spends x_i units of time on that
field. For obvious reasons, we shall assume each f_i is monotone
non-decreasing and continuous, and bounded below by 0 and above by 1.
Let N = {1, 2, ..., n} be the set of all questions. If the student
has probability p_i of answering question i correctly, and if all
these probabilities are independent, then, for given S ⊂ N,

(1)  P_S(p_1, \ldots, p_n) = \prod_{i \in S} p_i \prod_{j \notin S} (1 - p_j)

is the probability that the student answers all the questions in fields
i ∈ S, and none of the others, correctly.

Let m, now, be the required number of correct answers. Then
the student's probability of passing the test is

(2)  F(p_1, \ldots, p_n) = \sum_{S:\, s \ge m} \prod_{i \in S} p_i \prod_{j \notin S} (1 - p_j)

where the summation is taken over all sets S ⊂ N with at least m elements
(s denotes the number of elements of S).
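
As an illustration only, expression (2) can be evaluated numerically for
small n. The Python sketch below is not part of the original text: the
function names and the sample probabilities are assumptions chosen for the
example. It sums (2) literally over subsets and cross-checks the result
against a dynamic-programming count of correct answers.

```python
from itertools import combinations
from math import prod

def pass_prob_bruteforce(p, m):
    """Evaluate (2) literally: sum over all sets S with at least m elements."""
    n = len(p)
    total = 0.0
    for size in range(m, n + 1):
        for S in combinations(range(n), size):
            chosen = set(S)
            total += prod(p[i] if i in chosen else 1.0 - p[i] for i in range(n))
    return total

def pass_prob_dp(p, m):
    """Same quantity, built up one question at a time."""
    dist = [1.0]  # dist[c] = probability of exactly c correct answers so far
    for pi in p:
        new = [0.0] * (len(dist) + 1)
        for c, w in enumerate(dist):
            new[c] += w * (1.0 - pi)   # question answered incorrectly
            new[c + 1] += w * pi       # question answered correctly
        dist = new
    return sum(dist[m:])

p = [0.9, 0.5, 0.2]  # hypothetical per-field probabilities
print(pass_prob_bruteforce(p, 2), pass_prob_dp(p, 2))
```

The brute-force version visits every subset and is only practical for small
n; the second version needs on the order of n^2 operations.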

The student's problem is, then, to maximize expression (2) subject
to (1) and the budget constraint

(3)  \sum_{i=1}^{n} x_i \le a

(4)  x_i \ge 0,    i = 1, \ldots, n

where a is the student's available time. The first order conditions
for this problem are

(5)  \frac{\partial F}{\partial p_i} \frac{dp_i}{dx_i} = \lambda      if x_i > 0

(6)  \frac{\partial F}{\partial p_i} \frac{dp_i}{dx_i} \le \lambda    if x_i = 0


In the general case, of course, this presents a complicated
computation. We will consider the special case where

(7)  p_i(x) = \begin{cases} x & 0 \le x \le 1 \\ 1 & x > 1. \end{cases}

This  is  not  an  unreasonable  probability  function:  it  represents  the 
case  where  the  student  requires  unit  time  (the  time  can  of  course  be 
suitably  normalized)  to  read  each  section  of  the  textbook.  In  less  than 
unit  time,  he  can  only  read  a  proportional  fraction  of  the  section,  and 
the  probability  of  a  correct  answer  is  in  turn  proportional  to  that. 


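In this case, the first-order conditions take the form

     \frac{\partial F}{\partial p_i}
       \begin{cases} \ge \lambda & \text{if } p_i = 1 \\
                     =   \lambda & \text{if } 0 < p_i < 1 \\
                     \le \lambda & \text{if } p_i = 0 \end{cases}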


Now, it can be seen that

(8)  \frac{\partial F}{\partial p_i} = \sum_{S:\, i \in S,\, s = m} \prod_{j \in S,\, j \ne i} p_j \prod_{j \notin S} (1 - p_j)

where the sum is taken over all sets S containing i and exactly m-1
other elements. We shall use F_i to denote this partial derivative.
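
Expression (8) says that F_i is the probability that exactly m-1 of the
questions other than i are answered correctly. The Python sketch below is
a hedged numerical check of that reading; the helper names and the sample
vector p are assumptions made for the example. Because F is linear in each
p_i, the difference between F at p_i = 1 and F at p_i = 0 equals F_i, which
gives an independent check.

```python
def count_distribution(p):
    """dist[c] = probability that exactly c of the independent questions are correct."""
    dist = [1.0]
    for pi in p:
        new = [0.0] * (len(dist) + 1)
        for c, w in enumerate(dist):
            new[c] += w * (1.0 - pi)
            new[c + 1] += w * pi
        dist = new
    return dist

def pass_prob(p, m):
    """Probability of at least m correct answers, i.e. expression (2)."""
    return sum(count_distribution(p)[m:])

def partial(p, m, i):
    """F_i as in (8): exactly m-1 of the questions other than i are correct."""
    others = list(p[:i]) + list(p[i + 1:])
    dist = count_distribution(others)
    return dist[m - 1] if 0 <= m - 1 < len(dist) else 0.0

p = [0.3, 0.6, 0.8, 0.5]  # hypothetical probabilities
m, i = 3, 1
# Slope of F in p_i, taken between the two endpoints p_i = 0 and p_i = 1.
slope = pass_prob(p[:i] + [1.0] + p[i + 1:], m) - pass_prob(p[:i] + [0.0] + p[i + 1:], m)
print(partial(p, m, i), slope)
```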

We prove, now, that we need only consider points (p_1, ..., p_n) in
which each p_i has one of the three values 0, 1, and some other value p.

Lemma 1: The maximum of the function F, subject to the constraints
(3)-(4), is attained at a point (p_1, ..., p_n) whose components have only
one value other than 0 or 1.

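Proof: Let us consider the expression (8) for F_i. Letting
ℓ ≠ i, we can write this as

     F_i = \sum_{S:\, \ell \in S} \prod_{j \in S} p_j \prod_{j \notin S,\, j \ne i} (1 - p_j)
         + \sum_{S:\, \ell \notin S} \prod_{j \in S} p_j \prod_{j \notin S,\, j \ne i} (1 - p_j)

where the first sum is taken over all S with ℓ ∈ S, i ∉ S, s = m-1,
and the second over all S with i, ℓ ∉ S, s = m-1. We rewrite this as

     F_i = p_\ell \Big[ \sum_{S:\, \ell \in S} \prod_{j \in S,\, j \ne \ell} p_j \prod_{j \notin S,\, j \ne i} (1 - p_j) \Big]
         + (1 - p_\ell) \Big[ \sum_{S:\, \ell \notin S} \prod_{j \in S} p_j \prod_{j \notin S,\, j \ne i, \ell} (1 - p_j) \Big]

or equivalently,

(9)  F_i = p_\ell \sum_{s = m-2} \prod p_j \prod (1 - p_j)
         + (1 - p_\ell) \sum_{s = m-1} \prod p_j \prod (1 - p_j)

where the first sum is taken over all S with i, ℓ ∉ S, s = m-2, and the
second over all S with i, ℓ ∉ S, s = m-1. In each case the first
product is over all j ∈ S, the second over all j ∈ N-S-{i, ℓ}.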
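We have, then, writing (9) also with i and ℓ interchanged,

     F_i - F_\ell = (p_\ell - p_i) \Big[ \sum_{s = m-2} \prod p_j \prod (1 - p_j)
                                       - \sum_{s = m-1} \prod p_j \prod (1 - p_j) \Big]

where the two sums are as in (9), or equivalently,

(10)  F_i - F_\ell = (p_\ell - p_i) H_{i\ell}

where

(11)  H_{i\ell} = \sum_{S:\, s = m-2} \prod_{j \in S} p_j \prod_{j \notin S,\, j \ne i, \ell} (1 - p_j)
                - \sum_{S:\, s = m-1} \prod_{j \in S} p_j \prod_{j \notin S,\, j \ne i, \ell} (1 - p_j)

where the sums in (11) are over all subsets S ⊂ N-{i, ℓ} with m-2
and m-1 elements respectively. We note, inter alia, that H_{iℓ}
depends on p_j, j ≠ i, ℓ, but does not depend on p_i or p_ℓ.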
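
Identity (10) can be spot-checked numerically. In the Python sketch below,
offered as an illustration under stated assumptions, H is computed as the
probability of exactly m-2 correct answers minus the probability of exactly
m-1 correct answers among the questions other than i and ℓ. The function
names and the sample values are hypothetical.

```python
def count_distribution(p):
    """dist[c] = probability that exactly c of the independent questions are correct."""
    dist = [1.0]
    for pi in p:
        new = [0.0] * (len(dist) + 1)
        for c, w in enumerate(dist):
            new[c] += w * (1.0 - pi)
            new[c + 1] += w * pi
        dist = new
    return dist

def exactly(dist, c):
    """Probability of exactly c correct answers, zero outside the valid range."""
    return dist[c] if 0 <= c < len(dist) else 0.0

def partial(p, m, i):
    """F_i: exactly m-1 of the questions other than i are correct."""
    others = [pj for j, pj in enumerate(p) if j != i]
    return exactly(count_distribution(others), m - 1)

def H(p, m, i, l):
    """H for the pair (i, l): built only from the p_j with j different from i and l."""
    rest = [pj for j, pj in enumerate(p) if j not in (i, l)]
    dist = count_distribution(rest)
    return exactly(dist, m - 2) - exactly(dist, m - 1)

p = [0.2, 0.7, 0.4, 0.9, 0.55]  # hypothetical probabilities
m, i, l = 3, 0, 3
lhs = partial(p, m, i) - partial(p, m, l)
rhs = (p[l] - p[i]) * H(p, m, i, l)
print(lhs, rhs)
```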

Let X be the set of all p = (p_1, ..., p_n) which maximize F
subject to (3)-(4). By continuity of F, X will be compact and
non-empty. Then C(X), the convex hull of X, is compact and convex;
moreover, the extreme points of C(X) are all points of X (though not
all points of X are necessarily extreme in C(X)). We claim, now, that
if p* = (p_1*, ..., p_n*) is extreme in C(X), the components p_i*
will have at most one value other than 0 or 1.

In fact, suppose there is some pair of indices i, ℓ such that

     0 < p_i^* < p_\ell^* < 1.

Since p* ∈ X, then by (7-ii), we have

     F_i(p^*) = F_\ell(p^*).

Now, by (10),

     F_i - F_\ell = (p_\ell^* - p_i^*) H_{i\ell}(p^*).

However, p_i* < p_ℓ*, and so we must have H_{iℓ}(p*) = 0.

As was pointed out above, however, H_{iℓ} is independent of both
p_i and p_ℓ; thus, for any t, the point p'(t), given by

     p'_i(t) = p_i^* + t
     p'_\ell(t) = p_\ell^* - t
     p'_j(t) = p_j^*    for all other j

will also have H_{iℓ}(p') = 0. For sufficiently small t (both
positive and negative) p'(t) will satisfy the constraints (3)-(4).
Moreover, the directional derivative in the direction of increasing t
is F_i - F_ℓ, and this will be 0 for all values of t. Thus, for
sufficiently small t,

     F(p'(t)) = F(p'(-t)) = F(p^*).

Since p* maximizes F, so do p'(t) and p'(-t). But this means
both p'(t) and p'(-t) belong to X, and, since

     p^* = \frac{1}{2} \big( p'(t) + p'(-t) \big)

we conclude that p* is not extreme in C(X). This contradiction
proves the lemma.

We see, then, that the maximum of F will always be found at a point
of the form

(12)  p_j = \begin{cases} 1 & j \in M_1 \\ p & j \in M_2 \\ 0 & j \in M_3 \end{cases}

where M_1, M_2, M_3 are disjoint sets whose union is N, with
cardinalities m_1, m_2, and m_3, while 0 < p < 1. We have then

(13)  m_1 + m_2 + m_3 = n

(14)  m_1 + m_2 p = a.

It is easy to see that, in this case, we will have

(15)  F = \sum_{s = m - m_1}^{m_2} \binom{m_2}{s} p^s (1 - p)^{m_2 - s}.

In fact, all members of M_1 are always correct, and all members of
M_3 are always wrong. Thus the student will pass the exam if and only
if at least m-m_1 of the members of M_2 are answered correctly.
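
A minimal Python sketch of (15), given here as an illustration rather than
as part of the original argument. The function name and the sample
arguments are assumptions; the lower summation limit is clipped at 0 so the
function also covers the case m_1 ≥ m, where the student passes with
certainty.

```python
from math import comb

def pass_prob_three_valued(m1, m2, p, m):
    """Expression (15): m1 sure answers, m2 answers each correct with probability p."""
    need = max(m - m1, 0)  # correct answers still required from the fields in M_2
    return sum(comb(m2, s) * p**s * (1.0 - p)**(m2 - s) for s in range(need, m2 + 1))

# One sure field, three half-studied fields, three correct answers required:
# the student needs at least 2 of the 3 uncertain answers.
print(pass_prob_three_valued(m1=1, m2=3, p=0.5, m=3))
```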


Lemma 2: If a ≥ m, then F is maximized by setting m_1 ≥ m. If a < m,
then F is maximized by setting m_1 = 0, i.e., M_1 = ∅.

Proof: If a ≥ m, it is easy to see that F can be made equal to 1
simply by letting m_1 ≥ m. This is clearly a maximum.

Suppose, in fact, that a < m, but M_1 ≠ ∅. Then m_1 ≤ a < m, so
m_2 > 0 as otherwise we would have F = 0. Let i ∈ M_1, ℓ ∈ M_2; then
p_i = 1 and 0 < p_ℓ < 1, so assuming p is optimal, we must have

     F_i \ge F_\ell.

Now, however,

     F_i = \binom{m_2}{m - m_1} p^{m - m_1} (1 - p)^{m_1 + m_2 - m}

     F_\ell = \binom{m_2 - 1}{m - m_1 - 1} p^{m - m_1 - 1} (1 - p)^{m_1 + m_2 - m}

(since, as we saw before, F_j is simply the probability that exactly m-1
answers other than j be correct). Thus we have

     \binom{m_2}{m - m_1} p^{m - m_1} (1 - p)^{m_1 + m_2 - m}
       \ge \binom{m_2 - 1}{m - m_1 - 1} p^{m - m_1 - 1} (1 - p)^{m_1 + m_2 - m}

which reduces to

     \frac{m_2 p}{m - m_1} \ge 1

or

     m_2 p \ge m - m_1.

By (14), however, this gives us a ≥ m, which is a contradiction. Thus,
if a < m, then at the optimum, M_1 = ∅ as claimed. Q.E.D.
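
The reduction step in the proof rests on a ratio of binomial coefficients.
For readers who want it spelled out, the following worked step is added as
a sketch; it uses only the quantities m, m_1, m_2, p and a already defined
above.

```latex
\frac{\binom{m_2}{m-m_1}}{\binom{m_2-1}{m-m_1-1}}
  = \frac{m_2!}{(m-m_1)!\,(m_1+m_2-m)!}
    \cdot \frac{(m-m_1-1)!\,(m_1+m_2-m)!}{(m_2-1)!}
  = \frac{m_2}{m-m_1},
\qquad\text{so}\qquad
\frac{F_i}{F_\ell} = \frac{m_2\,p}{m-m_1} \ge 1 .
% With (14), m_2 p = a - m_1, hence a - m_1 \ge m - m_1, i.e. a \ge m.
```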

From Lemma 2 we see, then, that in the "difficult" case, a < m, we
have m_1 = 0. Denote M_2 by K; then M_3 = N-K, and so the optimum
will be obtained at a point

     p_j = \begin{cases} a/k & \text{if } j \in K \\ 0 & \text{if } j \notin K \end{cases}

where K has k elements. In this case

     F = \sum_{s = m}^{k} \binom{k}{s} \left(\frac{a}{k}\right)^s \left(1 - \frac{a}{k}\right)^{k - s}

and we look for the value of k, m ≤ k ≤ n, which maximizes this expression.
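
For small n the maximizing k can simply be found by enumeration. The
Python sketch below is a hedged illustration, assuming a < m so that
a/k < 1 for every admissible k; the names `pass_prob_k` and `best_k` and
the sample values of n, m and a are assumptions made for the example.

```python
from math import comb

def pass_prob_k(a, k, m):
    """F when time a is split evenly over k sections, so p = a/k on each (assumes a < k)."""
    p = a / k
    return sum(comb(k, s) * p**s * (1.0 - p)**(k - s) for s in range(m, k + 1))

def best_k(a, n, m):
    """Number of sections k, m <= k <= n, with the largest pass probability."""
    return max(range(m, n + 1), key=lambda k: pass_prob_k(a, k, m))

n, m = 5, 3  # hypothetical exam: 5 fields, 3 correct answers required
for a in (2.0, 2.5):
    print(a, best_k(a, n, m), [round(pass_prob_k(a, k, m), 4) for k in range(m, n + 1)])
```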

In general, we can obtain this number from tables of the cumulative
normal distribution. To get an idea of its behavior, however, we let

     q_k(s) = \binom{k}{s} \left(\frac{a}{k}\right)^s \left(\frac{k - a}{k}\right)^{k - s}

be the probability of exactly s correct answers, assuming that the
student divided his time among k sections. Then

     \frac{q_k(s)}{q_{k-1}(s)} = \frac{k}{k - s} \cdot \frac{(k-1)^{k-1}}{k^k} \cdot \frac{(k - a)^{k - s}}{(k - a - 1)^{k - s - 1}}.

As a → 0, this expression approaches the limit

(19)  L_k(s) = \frac{k}{k - s} \left(\frac{k - 1}{k}\right)^s.

Now, it is easy to see that, for k > 0, and s > 1,

     1 - \frac{s}{k} < \left(1 - \frac{1}{k}\right)^s

and so L_k(s) > 1 for s > 1. We conclude that, for small values of
a, q_k(s) > q_{k-1}(s) for all k and all s > 1, and so k should
be chosen as large as possible, i.e. k = n. On the other hand, if a
is large, i.e., sufficiently close to m, we know it is best to choose
k = m.
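
The limit (19) can be checked numerically. The Python sketch below is
illustrative only; the function names and the choice k = 13, s = 7 are
assumptions. It prints the ratio q_k(s)/q_{k-1}(s) for decreasing a next
to the limiting value L_k(s).

```python
from math import comb

def q(k, s, a):
    """Probability of exactly s correct answers when time a is spread over k sections."""
    p = a / k
    return comb(k, s) * p**s * (1.0 - p)**(k - s)

def ratio_limit(k, s):
    """Limit of q_k(s) / q_{k-1}(s) as a -> 0, expression (19)."""
    return k / (k - s) * ((k - 1) / k) ** s

k, s = 13, 7  # hypothetical values
for a in (1.0, 0.1, 0.001):
    print(a, q(k, s, a) / q(k - 1, s, a), ratio_limit(k, s))
```

For s = 1 the limit equals 1 exactly, which is why the strict inequality is
stated only for s > 1.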

We conclude, then, that for small a the student should study some
of each section; for large a (i.e., near m) he should concentrate his
studying on m of the sections. What is not clear is whether any
intermediate values of k (i.e., m < k < n) are ever optimal.

To  look  at  this  problem  in  some  detail,  we  consider  the  case  n  =  13, 
m  =  7.  Figure  1  shows  the  result  of  our  calculations:  k  =  13  is  optimal 
for all a < 6.16, while k = 7 is optimal for a > 6.30. In between
there seem to be five small subintervals where k = 12, 11, 10, 9, 8 are
successively optimal.
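
A scan of this kind is easy to reproduce. The Python sketch below is a
hedged illustration and not the authors' computation: the grid spacing of
0.01 and the function names are assumptions. It reports each value of a on
the grid at which the optimal k changes for n = 13, m = 7.

```python
from math import comb

def pass_prob_k(a, k, m):
    """Pass probability when time a is split evenly over k sections (assumes a < k)."""
    p = a / k
    return sum(comb(k, s) * p**s * (1.0 - p)**(k - s) for s in range(m, k + 1))

def best_k(a, n, m):
    """Number of sections k, m <= k <= n, with the largest pass probability."""
    return max(range(m, n + 1), key=lambda k: pass_prob_k(a, k, m))

n, m = 13, 7
previous = None
for step in range(600, 651):  # a = 6.00, 6.01, ..., 6.50 (assumed grid)
    a = step / 100
    k = best_k(a, n, m)
    if k != previous:
        print(f"a = {a:.2f}: optimal k = {k}")
        previous = k
```

A finer grid, or a root finder applied to the difference between two
neighbouring values of k, would locate the switching points more precisely.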

It is not clear whether this type of behavior always holds, though in
the several cases studied by the authors this is indeed the case. If we
look at expression (18), we note that, as a function of a, these ratios
are convex, i.e.,

     \frac{\partial^2}{\partial a^2} \left( \frac{q_k(s)}{q_{k-1}(s)} \right) \ge 0.

This suggests (though it does not prove) that this type of behavior will
usually hold.

For small values of m, it is not difficult to show that this is
indeed the case. For example, in the case n=3, m=2, we find k=3 is
optimal for a ≤ 1.125, with k=2 optimal if a > 1.125.

For n=5, m=3, we find that k=5 is optimal if a < 2.117; k=4 is
optimal for 2.117 ≤ a ≤ 2.173; finally, k=3 will be optimal if a >
2.173.
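
The switching points for these small cases can be located by bisection on
the difference between two pass probabilities. The Python sketch below is
illustrative; the function names, brackets and tolerance are assumptions,
and the bisection presumes a single sign change inside each bracket. Under
the linear model (7) it reproduces 1.125 for n=3, m=2 and about 2.173 for
the k=4 to k=3 switch. For the k=5 to k=4 switch it returns about 2.143
rather than the 2.117 quoted above, so that figure may merit rechecking.

```python
from math import comb

def pass_prob_k(a, k, m):
    """Pass probability when time a is split evenly over k sections (assumes a < k)."""
    p = a / k
    return sum(comb(k, s) * p**s * (1.0 - p)**(k - s) for s in range(m, k + 1))

def crossover(k_big, k_small, m, lo, hi, tol=1e-10):
    """Value of a in (lo, hi) where spreading over k_big sections stops beating k_small."""
    def gain(a):
        return pass_prob_k(a, k_big, m) - pass_prob_k(a, k_small, m)
    # The assumed bracket must straddle the switch: k_big better at lo, worse at hi.
    assert gain(lo) > 0 > gain(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(crossover(3, 2, 2, 0.5, 1.9))  # n = 3, m = 2: k = 3 versus k = 2
print(crossover(5, 4, 3, 1.0, 2.9))  # n = 5, m = 3: k = 5 versus k = 4
print(crossover(4, 3, 3, 1.0, 2.9))  # n = 5, m = 3: k = 4 versus k = 3
```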

The number of correct answers, assuming all study was concentrated
on k sections of the course, is a binomial random variable with
parameters k and a/k; its mean is therefore a, and its variance is
a(1 - a/k). For large values of m and n, this can generally be approximated
by using either the normal or the Poisson distribution.

Suppose a is close to m, say a = m - λ. Then, letting k = m, we would
have

     1 - p_j = 1 - \frac{a}{m} = \frac{\lambda}{m}    for j ∈ K

and so the number of incorrect answers among the m sections studied is
a binomial variable with mean λ. If we use the Poisson approximation,
the probability of r incorrect answers will be

     Q_\lambda(r) = e^{-\lambda} \frac{\lambda^r}{r!}.

In particular, the probability of passing the exam is Q_λ(0), or e^{-λ}.

As against this, if the student studies m+1 sections, the number of
incorrect answers among the sections studied will also be approximately
Poisson with mean λ + 1. To pass, at most one can be incorrect; the
probability of passing is then

     Q_{\lambda+1}(0) + Q_{\lambda+1}(1) = e^{-(\lambda+1)} (1 + \lambda + 1)

and this will be greater than e^{-λ} only if λ > e-2, i.e., if a < m+2-e, or
about a < m-0.718. Thus k = m is optimal if a > m-0.718.
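
The threshold e-2 follows from comparing the two Poisson expressions
directly. The Python sketch below is a small illustration of that
comparison; the function names and the sample shortfalls are assumptions,
and `lam` stands for the shortfall m minus a.

```python
from math import e, exp

def pass_studying_m(lam):
    """Poisson approximation with k = m: no incorrect answer is allowed."""
    return exp(-lam)

def pass_studying_m_plus_1(lam):
    """Poisson approximation with k = m + 1 (mean lam + 1): at most one incorrect answer."""
    return exp(-(lam + 1.0)) * (1.0 + (lam + 1.0))

threshold = e - 2.0  # about 0.718
for lam in (0.5, threshold, 1.0):
    print(lam, pass_studying_m(lam), pass_studying_m_plus_1(lam))
```

Below the threshold the first column of probabilities is larger, above it
the second; at the threshold the two agree.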

Suppose, on the other hand, a is considerably smaller than m. In
this case concentration on k sections gives us a binomial variable
which can best be approximated by a normal variable with mean a and
variance a(1 - a/k). To pass the examination, the student requires at least
m correct answers, i.e., the variable must have a value at least equal to
m - 1/2 (the fractional modification is standard in such cases). If a,
the mean of the variable, is more than slightly below m - 1/2, this probability
will be maximized by making the variance as large as possible. With
a fixed, this is done by setting k as large as possible, i.e., k = n.
The probability of passing the exam will then be given by

     P = \Phi\left( \frac{a - m + \frac{1}{2}}{\sqrt{a(1 - a/n)}} \right)

where Φ is the cumulative standard normal distribution function.
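
To give a feel for the quality of this approximation, the Python sketch
below compares it with the exact binomial sum for k = n. It is an
illustration under stated assumptions: the function names and the choice
n = 13, m = 7 with a few values of a are hypothetical, and Φ is computed
from the error function in the standard library.

```python
from math import comb, erf, sqrt

def std_normal_cdf(z):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def pass_prob_normal(a, n, m):
    """Normal approximation with the half-unit correction, for k = n."""
    return std_normal_cdf((a - m + 0.5) / sqrt(a * (1.0 - a / n)))

def pass_prob_exact(a, n, m):
    """Exact binomial pass probability when time a is spread over all n sections."""
    p = a / n
    return sum(comb(n, s) * p**s * (1.0 - p)**(n - s) for s in range(m, n + 1))

n, m = 13, 7
for a in (3.0, 4.0, 5.0):
    print(a, pass_prob_exact(a, n, m), pass_prob_normal(a, n, m))
```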

One  interesting  observation  remains  to  be  made,  and  it  concerns  the 
person  who  makes  up  the  exam.  If,  instead  of  asking  one  question  on  each 
section  of  the  course,  he  were  to  choose  n  questions  at  random 
(independently)  from  the  entire  subject  matter  of  the  course,  then  the 
student  who  devotes  a  units  of  time  (where  n  units  would  be 
required  to  know  the  entire  subject)  would  have  probability  a/n  on  each 
question.  In  effect,  this  is  the  same  as  if  the  student  had  devoted 
a/n  units  to  each  of  the  n  sections  of  the  course.  But  we  have  seen 
that  this  is  precisely  the  optimal  study  strategy  for  the  student  who 
spends  a  relatively  small  time  preparing  for  this  exam.  Thus,  such  a 
strategy  on  the  part  of  the  examiner  will  penalize  only  the  students  who 
spend  a  relatively  long  time  preparing,  i.e.,  the  conscientious 
students.  In  other  words,  the  student  who  knows,  e.g.,  80%  of  the  course 
material  will  get  a  grade  of  80%  if  there  is  one  question  from  each 
section,  but  might  fail  if  the  questions  are  chosen  randomly  from  the 
entire  course  matter.  The  student  who  knows  only  30%  of  the  course 
matter  has  the  same  probability  of  passing  under  either  mode  of 
examination.